UPSTREAM PR #18334: webui: add MCP (Model Context Protocol) support #679
Add JSON-RPC 2.0 type definitions and MCP server configuration structures for the Model Context Protocol implementation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add mcp_process class for spawning and managing MCP server subprocesses with bidirectional stdio communication. Handles process lifecycle, environment variables for unbuffered output, and cross-platform support.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add custom WebSocket server using raw sockets (no external library). Implements RFC 6455 handshake, frame parsing, masking, and message handling. Runs on HTTP port + 1 to avoid conflicts with httplib.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add server_mcp_bridge class that routes WebSocket messages to MCP server subprocesses. Manages per-connection state, configuration loading with hot-reload, and JSON-RPC 2.0 message forwarding.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Integrate MCP bridge and WebSocket server into main server:
- Add --mcp-config CLI argument for configuration path
- Add /mcp/servers and /mcp/ws-port HTTP endpoints
- Register WebSocket event handlers for MCP
- Update server-http to properly join thread on stop
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add TypeScript types for MCP protocol (JSON-RPC 2.0) and WebSocket service for communicating with MCP servers:
- MCP types: tool definitions, JSON-RPC request/response/notification
- McpService: WebSocket client with auto-reconnect and request timeout
- API types: tool call interfaces for chat completions
- Vite config: proxy WebSocket connections to MCP port
- ESLint: allow underscore-prefixed unused args (common convention)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add reactive Svelte 5 stores for managing MCP state:
- mcpStore: Global MCP connection state, tool discovery, tool calling
- conversationMcpStore: Per-conversation MCP server enable/disable
Uses SvelteMap/SvelteSet for proper Svelte 5 reactivity.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add components for displaying MCP tool calls and results:
- ToolCallBlock: Collapsible display of tool call with arguments/results
- ToolResultDisplay: Format and render tool execution results
- tool-results.ts: Utility functions for parsing tool result messages
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add UI components for managing MCP server connections:
- ChatFormActionMcp: Server selector dropdown in chat input
- McpPanel: Full panel for viewing connected servers and tools
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Integrate MCP tool calling into the chat flow:
- chat.ts: Add tool parameter injection and MCP tool execution
- chat.svelte.ts: Track tool calls, results, and processing state
- ChatMessageAssistant: Display tool calls with status and duration
- ChatMessages: Build tool result map, filter tool result messages
- ChatScreen: Wire up tool result event handlers
- Add duration guard for negative timestamp differences
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add Python tests for MCP functionality:
- test_mcp_servers_endpoint: Test /mcp/servers HTTP endpoint
- test_mcp_ws_port_endpoint: Test /mcp/ws-port HTTP endpoint
- test_mcp_initialize_handshake: Test MCP JSON-RPC initialization
- test_mcp_tools_list: Test tools/list method
- test_mcp_tool_call: Test tools/call method
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add documentation and example configuration for MCP:
- README: Document MCP configuration, usage, and WebSocket port
- mcp_config.example.json: Example config with filesystem and brave-search
- Rebuild webui bundle with MCP support
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Force popover to open above (side="top") for consistent positioning
- Search input at bottom (flips based on popover position)
- Small solid dots for connection status (green/gray)
- Hover row to reveal connect/disconnect action icons
- Remove Connect All/Disconnect All footer buttons
- Fix double X button in search input (hide native WebKit clear)
- Add tooltips for status and actions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Don't show "Streaming..." status while arguments are being streamed. Only show "Calling tool..." when actually waiting for MCP server response.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Reorder assistant message layout so tool call blocks appear before the model badge and statistics.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Remove unused parameter names from MCP HTTP lambda handlers
- Remove conditional websocket import (it's a required dependency)
Fixes unused-parameter warning and pyright type-check errors.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Adds optional "cwd" field to mcp.json server configurations to set the working directory for stdio MCP servers.
- Add cwd field to mcp_server_config struct
- Unix: call chdir() before execvp() in child process
- Windows: pass lpCurrentDirectory to CreateProcessA()
- Update mcp_config.example.json with usage example
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
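
The subprocess behaviour described in the commits above (a configured command and arguments, extra environment variables, an optional "cwd", and piped stdio) can be illustrated with a short Python sketch. This is not the PR's C++ mcp_process class; the function name spawn_stdio_server is hypothetical, and the use of PYTHONUNBUFFERED as the "unbuffered output" variable is an assumption, since the commit message does not name the variables it sets.

```python
import os
import subprocess
import sys


def spawn_stdio_server(entry: dict) -> subprocess.Popen:
    """Start one MCP server from a config entry, with stdin/stdout piped.

    `entry` is assumed to look like one value of "mcpServers" in the
    configuration file: {"command": ..., "args": [...], "env": {...}, "cwd": ...}.
    """
    # Start from the parent's environment and overlay the configured variables.
    env = os.environ.copy()
    env.update(entry.get("env", {}))
    # Assumption: request unbuffered output from Python-based servers so each
    # JSON-RPC line reaches the parent as soon as it is written.
    env.setdefault("PYTHONUNBUFFERED", "1")

    return subprocess.Popen(
        [entry["command"], *entry.get("args", [])],
        cwd=entry.get("cwd") or None,  # None keeps the parent's working directory
        env=env,
        stdin=subprocess.PIPE,   # requests are written here
        stdout=subprocess.PIPE,  # responses are read from here
        text=True,
    )
```

Passing cwd to Popen plays the same role as the chdir() call before execvp() on Unix and the lpCurrentDirectory argument on Windows: the child starts in the configured directory while the parent's working directory is unchanged.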
Explore the complete analysis inside the Version Insights
Performance Analysis Summary: PR #679 - MCP Support Integration
Overview
This PR introduces Model Context Protocol support to llama-server through new WebSocket infrastructure, subprocess management, and CLI argument extensions. The changes add approximately 1,200 ns to application startup through argument parsing modifications.
Key Findings
- Argument Parsing Impact:
- Inference Performance:
- Power Consumption:
- Code Changes:
Explore the complete analysis inside the Version Insights
Performance Analysis Summary: PR #679 MCP Support
Overview
This PR adds Model Context Protocol (MCP) support to llama-server, introducing WebSocket infrastructure, subprocess management, and frontend UI components. The changes span 42 files with 5,857 additions and 127 deletions.
Key Findings
Performance-Critical Areas Impact
Inference Pipeline Functions:
Argument Parsing Degradation:
These functions handle CLI argument parsing during server initialization. The degradation affects startup time only, not runtime inference performance. The PR adds one new argument (--mcp-config) which contributes minimally to the existing systemic complexity in the argument parser infrastructure.
Server Infrastructure Changes:
The server-http.cpp stop() method now includes thread.join() for proper cleanup, adding synchronization wait during shutdown only.
Power Consumption Analysis
Power consumption changes are minimal across all binaries:
The power consumption variations are within measurement noise and do not indicate meaningful efficiency changes.
Architecture Impact
The PR introduces parallel execution paths for MCP tool calling that operate independently of the inference engine. WebSocket connections and MCP subprocesses run in separate threads, ensuring tool invocations do not block model inference. Memory overhead is approximately 50 KB base plus 25 KB per active MCP server connection.
15838f1 to 006b713
07aff19 to 1f52e52
Mirrored from ggml-org/llama.cpp#18334
Summary
This PR adds MCP (Model Context Protocol) support to llama-server's web UI.
Only MCP servers that use the stdio transport are supported for now. llama-server spawns and manages these processes on behalf of the frontend, which connects to them through WebSockets, one per conversation per server.
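
According to the commit descriptions, the traffic on each WebSocket is JSON-RPC 2.0, and the tests exercise the initialize handshake, tools/list, and tools/call. The following Python sketch shows what such messages look like on the wire. The helper names (request, notification, parse_response), the client name, and the tool name read_file are hypothetical; the protocol version string is one published MCP revision and may differ from what the PR sends.

```python
import itertools
import json
from typing import Any, Optional, Tuple

_next_id = itertools.count(1)  # JSON-RPC ids only need to be unique per connection


def request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request (expects a response with the same id)."""
    msg = {"jsonrpc": "2.0", "id": next(_next_id), "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)


def notification(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 notification (no id, so no response is sent)."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)


def parse_response(raw: str) -> Tuple[Any, Any]:
    """Return (id, result), or raise if the server reported an error."""
    msg = json.loads(raw)
    if "error" in msg:
        err = msg["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return msg["id"], msg.get("result")


# A typical exchange, in the order a client would send it.
handshake = [
    request("initialize", {
        "protocolVersion": "2024-11-05",  # assumed revision
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.0"},  # hypothetical
    }),
    notification("notifications/initialized"),
    request("tools/list"),
    request("tools/call", {"name": "read_file", "arguments": {"path": "notes.txt"}}),
]
```

Each string in handshake would be sent as one WebSocket text message and forwarded by the bridge to the subprocess's stdin; responses are matched back to requests by id.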
Features
New CLI Option
--mcp-config: path to the MCP configuration file
Configuration Example
{
  "mcpServers": {
    "brave-search": {
      "command": "npx",
      "args": ["-y", "@brave/brave-search-mcp-server", "--transport", "stdio"],
      "env": {
        "BRAVE_API_KEY": "... get your key at https://api.search.brave.com/app/keys ..."
      }
    },
    "python": {
      "command": "uvx",
      "args": ["mcp-run-python", "--deps", "numpy,pandas,pydantic,requests,httpx,sympy,aiohttp", "stdio"],
      "env": {}
    }
  }
}
Architecture
- server-ws.cpp/h - WebSocket server implementation
- server-mcp-bridge.cpp/h - Routes WebSocket connections to MCP subprocesses
- server-mproc.cpp/h - Cross-platform subprocess management
- server-mcp.h - MCP protocol type definitions
API Endpoints
- GET /mcp/servers - List available MCP servers
- GET /mcp/ws-port - Get WebSocket port number
- WS /mcp?server=<name> - WebSocket connection (on HTTP port + 1)
Test plan
- […] (tools/server/tests/unit/test_mcp.py)
- […] @modelcontextprotocol/server-filesystem
🤖 Generated with Claude Code